# Replication Package: Demographics and International Capital Flows

**Two-Paper Package, February 2026**

- Paper 1: "Demographics and Current Accounts: Evidence from 140 Countries" (`followup/`)
- Paper 2: "Where Does Demographic Capital Go? Bilateral Evidence from a Gravity Model" (`gravity_bilateral/`)

---

## 1. Software Requirements

- Python 3.10+
- Required packages: `numpy`, `pandas`, `statsmodels`, `scipy`, `matplotlib`, `wbgapi`, `imfp`, `requests`, `pypandoc`, `python-pptx`
- Install: `pip install numpy pandas statsmodels scipy matplotlib wbgapi imfp requests pypandoc python-pptx`
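
Before running either pipeline, it can help to confirm that every dependency resolves in the active environment. The following is a minimal sketch, not part of the package: `missing_modules` is a hypothetical helper, and the list uses import names, which differ from the PyPI name in one case (`python-pptx` is imported as `pptx`).

```python
import importlib.util

# Import names for the packages listed above (python-pptx is imported as "pptx").
REQUIRED_MODULES = [
    "numpy", "pandas", "statsmodels", "scipy", "matplotlib",
    "wbgapi", "imfp", "requests", "pypandoc", "pptx",
]


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Note that `pypandoc` is a wrapper and typically also needs a working pandoc installation on the system.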

## 2. Directory Structure

```
demographics_capital_flows/
  followup/                          # Paper 1: Multilateral (140 countries)
    run_pipeline.py                  # Main orchestrator (8 steps)
    src/                             # Source modules
      download.py                    # Data acquisition (UN WPP, IMF WEO, PWT, WDI)
      demographics.py                # Fair-Dominguez polynomial construction
      macro.py                       # EBA control variable assembly
      interest_rates.py              # Bond yields (FRED, IMF IFS, OECD)
      model.py                       # PanelGLS estimation with AR(1)
      scenarios.py                   # Projections and GE clearing
      structural_breaks.py           # Rolling-window stability tests
      visualize.py, visualize_breaks.py  # Figure generation
      paper_tables.py                # Publication table formatting
    scripts/                         # Analysis scripts (phases 4-7)
      phase4cd_interactions_pension.py   # 3-way interactions, pension model
      phase4e_joint_tests_ge_decomp.py   # Joint F-tests, GE decomposition
      phase4f_horserace_jackknife.py     # KAOPEN vs GDP horse race, jackknife
      phase4g_multiple_testing_cca.py    # Bonferroni correction, CCA decomposition
      phase4h_cca_event_study.py         # CCA event study
      phase5bc_projections_ge.py         # 140-country projections with GE
      phase6_revision_robustness.py      # Reviewer-requested robustness
      phase7_referee_robustness.py       # Country FE, long diff, clustered SE, CCA diagnostics
      generate_paper_figures.py          # Publication figures
    data/raw/                        # Downloaded source data
    data/processed/
      full_panel.csv                 # Main estimation dataset (21 MB)
      macro_panel.csv                # Macro controls panel
    output/tables/                   # All regression outputs (48 CSV files)
    paper/
      paper_revised_20260219v2.md    # Paper manuscript
      references.bib                 # Bibliography

  gravity_bilateral/                 # Paper 2: Bilateral gravity
    scripts/
      phase1_download_data.py        # CPIS/CDIS download, CEPII gravity, panel construction
      phase2_estimation.py           # Gravity models (2b-2f)
      phase3_cca_robustness.py       # CCA exclusion, jackknife, extensive/intensive margin
      phase4_expand_yields.py        # Expand bond yield coverage to 35 countries
      phase4b_reestimate_expanded.py # Re-estimate with expanded yields
      phase5_ppml.py                 # PPML robustness
      phase5_projections.py          # Bilateral projections through 2050
      phase5b_projections_ge.py      # GE clearing overlay
      phase6_financial_center_robustness.py  # Financial center exclusion
      phase7_referee_robustness.py   # Clustered SEs, pair FE, structural PPML
      wald_tests.py                  # Wald tests on demographic distance
    src/
      model.py                       # PanelGLS (pair-level AR(1))
      macro.py                       # Macro variable construction
    data/raw/
      cpis_bilateral.csv             # IMF CPIS (297 MB)
      cdis_bilateral.csv             # IMF CDIS (17 MB)
      cepii_geodist.csv              # CEPII gravity distances
      expanded_bond_yields.csv       # 35-country bond yields
    data/processed/
      bilateral_panel.csv            # Main estimation dataset (311 MB)
    output/tables/                   # All regression outputs (19 CSV files)
    paper/
      gravity_paper_20260219v3.md    # Paper manuscript
      references.bib                 # Bibliography
      convert_to_docx.py             # DOCX generation
```

## 3. Replication Instructions

### Paper 1: Multilateral

```bash
cd followup/

# Step 1: Run main pipeline (downloads data, constructs variables, estimates models)
python3 run_pipeline.py

# Step 2: Run analysis scripts in order
python3 scripts/phase4cd_interactions_pension.py
python3 scripts/phase4e_joint_tests_ge_decomp.py
python3 scripts/phase4f_horserace_jackknife.py
python3 scripts/phase4g_multiple_testing_cca.py
python3 scripts/phase4h_cca_event_study.py
python3 scripts/phase5bc_projections_ge.py
python3 scripts/phase6_revision_robustness.py
python3 scripts/phase7_referee_robustness.py

# Step 3: Generate figures
python3 scripts/generate_paper_figures.py
```

### Paper 2: Gravity

```bash
cd gravity_bilateral/

# Step 1: Download and construct bilateral panel
python3 scripts/phase1_download_data.py

# Step 2: Estimate gravity models
python3 scripts/phase2_estimation.py

# Steps 3-7: Robustness and extensions, followed by the Wald tests
python3 scripts/phase3_cca_robustness.py
python3 scripts/phase4_expand_yields.py
python3 scripts/phase4b_reestimate_expanded.py
python3 scripts/phase5_ppml.py
python3 scripts/phase5_projections.py
python3 scripts/phase5b_projections_ge.py
python3 scripts/phase6_financial_center_robustness.py
python3 scripts/phase7_referee_robustness.py
python3 scripts/wald_tests.py
```
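
Because later scripts read outputs written by earlier ones, it may be convenient to run the sequence from a single driver that stops at the first failure. The sketch below is an illustration under that assumption; `run_in_order` and `PAPER2_SCRIPTS` are hypothetical names that do not exist in the package.

```python
import subprocess
import sys


def run_in_order(scripts, cwd="."):
    """Run each script with the current interpreter; stop at the first failure."""
    completed = []
    for script in scripts:
        result = subprocess.run([sys.executable, script], cwd=cwd)
        if result.returncode != 0:
            raise RuntimeError(f"{script} exited with code {result.returncode}")
        completed.append(script)
    return completed


# Paper 2 scripts in the order given above.
PAPER2_SCRIPTS = [
    "scripts/phase1_download_data.py",
    "scripts/phase2_estimation.py",
    "scripts/phase3_cca_robustness.py",
    "scripts/phase4_expand_yields.py",
    "scripts/phase4b_reestimate_expanded.py",
    "scripts/phase5_ppml.py",
    "scripts/phase5_projections.py",
    "scripts/phase5b_projections_ge.py",
    "scripts/phase6_financial_center_robustness.py",
    "scripts/phase7_referee_robustness.py",
    "scripts/wald_tests.py",
]

# Example call from the package root:
# run_in_order(PAPER2_SCRIPTS, cwd="gravity_bilateral")
```

The same helper could be pointed at the Paper 1 script list with `cwd="followup"`.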

## 4. Variable Definitions

### Demographic Variables (both papers)
| Variable | Definition | Source |
|:---------|:-----------|:-------|
| Z_1, Z_2, Z_3 | Fair-Dominguez polynomial: Z_k = sum_g(g^k * s_g), where s_g are GDP-weighted demeaned age shares | UN WPP 2024 |
| dZ_k (Paper 2) | Bilateral demographic distance: dZ_k_ij = Z_k_i - Z_k_j | Constructed |
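
To make the polynomial definition concrete, here is a minimal sketch of Z_k = sum_g(g^k * s_g) for a single year. It is not the code in `src/demographics.py`. It assumes that "GDP-weighted demeaned" means each age share is measured as a deviation from the GDP-weighted cross-country average of that share, and that age groups are indexed g = 1..G; the function name and argument layout are hypothetical.

```python
import numpy as np


def fair_dominguez_z(shares, gdp_weights, max_order=3):
    """Compute Z_1..Z_max_order for one year.

    shares:      (n_countries, n_groups) age shares, each row summing to 1
    gdp_weights: (n_countries,) non-negative GDP weights
    Returns an (n_countries, max_order) array; column k-1 holds Z_k.
    """
    shares = np.asarray(shares, dtype=float)
    w = np.asarray(gdp_weights, dtype=float)
    w = w / w.sum()

    # Assumption: demean against the GDP-weighted average share of each age group.
    world_avg = w @ shares
    s = shares - world_avg

    # Assumption: age-group index g runs 1..G (the package may rescale g).
    g = np.arange(1, shares.shape[1] + 1)
    powers = np.stack([g ** k for k in range(1, max_order + 1)], axis=1)  # (G, K)

    return s @ powers  # Z_k = sum_g g^k * s_g
```

By construction, the GDP-weighted average of each Z_k across countries is zero, and the bilateral distance in Paper 2 is then the row difference `Z[i] - Z[j]`.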

### Control Variables (Paper 1)
| Variable | Definition | Source |
|:---------|:-----------|:-------|
| ca_gdp | Current account balance / GDP (%) | IMF WEO |
| fiscal_bal_gdp | General government net lending/borrowing / GDP (%), winsorized p1/p99 | IMF WEO |
| kaopen | Chinn-Ito financial openness index (normalized) | Chinn & Ito |
| expected_growth | 5-year-ahead IMF WEO GDP growth forecast | IMF WEO |
| nfa_gdp_lag | Previous year's net foreign assets / GDP | Lane-Milesi-Ferretti EWN |
| log_rel_opw | Log output per worker relative to US (PPP) | PWT 10.0 |
| health_exp_gdp | Government health expenditure / GDP (%) | WDI |
| log_lending_rate | log(1 + lending_rate/100) | IMF MFS, FRED |

### Gravity Variables (Paper 2)
| Variable | Definition | Source |
|:---------|:-----------|:-------|
| log_portfolio_total | log(bilateral portfolio investment position, USD) | IMF CPIS |
| log_dist | log(population-weighted bilateral distance, km) | CEPII GeoDist |
| contiguity | 1 if countries share a land border | CEPII |
| common_lang_official | 1 if countries share an official language | CEPII |
| colonial_ties | 1 if colonial relationship | CEPII |
| log_gdp_product | log(GDP_i * GDP_j) | IMF WEO |
| kaopen_j | Destination-country KAOPEN | Chinn & Ito |

## 5. Sample Construction

### Paper 1: 140-Country Panel
- **Countries**: 140 (49 EBA + 20 SSA + 10 EU completion + 61 additional)
- **Years**: 1970-2024 (estimation uses year <= 2024)
- **Missing data**: Listwise deletion; full model N = 2,746 (138 countries with complete controls)
- **Transformations**: fiscal_bal_gdp winsorized at p1/p99; log_lending_rate = log(1 + rate/100)
- **Projection panel**: UN WPP extends to 2101; projections filter year <= 2060
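
The transformations and sample filters listed above can be sketched as follows. This is an illustration rather than the package code: it assumes winsorization uses pooled (not year-by-year) percentiles and that the raw rate is stored in a `lending_rate` column in percent; the helper names are hypothetical.

```python
import numpy as np
import pandas as pd


def winsorize(series, lower=0.01, upper=0.99):
    """Clip a series at its lower and upper quantiles (pooled over all rows)."""
    lo, hi = series.quantile([lower, upper])
    return series.clip(lower=lo, upper=hi)


def add_transforms(df):
    """Apply the p1/p99 winsorization and the log(1 + rate/100) transform."""
    out = df.copy()
    out["fiscal_bal_gdp"] = winsorize(out["fiscal_bal_gdp"])
    out["log_lending_rate"] = np.log1p(out["lending_rate"] / 100.0)
    return out


def estimation_sample(df, columns, max_year=2024):
    """Keep year <= max_year, then apply listwise deletion over the model columns."""
    return df[df["year"] <= max_year].dropna(subset=columns)
```

Passing the full list of regressors as `columns` reproduces the listwise-deletion logic: any row with a missing value in one of them is dropped.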

### Paper 2: Bilateral Panel
- **Construction**: CPIS reporter × partner × year merged with CEPII gravity variables
- **Zero handling**: CPIS reports zeros explicitly; the log specification drops zeros (N = 104,965), while PPML includes them (N = 116,184)
- **Coverage**: 82 reporters, ~200 partners, 2001-2024; CDIS: 2009-2024
- **Demographics merge**: Country-level Z_1, Z_2, Z_3 from followup/data/processed/full_panel.csv (year <= 2024)
- **KAOPEN merge**: Destination KAOPEN matched on (iso_d, year)
- **Pair ID**: Constructed as "ISO_o_ISO_d" string for panel entity
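
The merge steps above can be sketched with pandas as follows. This is an illustrative outline, not the logic of `phase1_download_data.py`: the input frames are assumed to contain only the columns named in the docstring, and `position` is a hypothetical name for the CPIS holding in USD.

```python
import numpy as np
import pandas as pd

Z_COLS = ["Z_1", "Z_2", "Z_3"]


def build_bilateral_panel(cpis, demo, kaopen):
    """Merge country-level variables onto reporter x partner x year rows.

    cpis:   iso_o, iso_d, year, position
    demo:   iso3, year, Z_1, Z_2, Z_3
    kaopen: iso3, year, kaopen
    """
    demo_o = demo.rename(columns={"iso3": "iso_o", **{z: f"{z}_o" for z in Z_COLS}})
    demo_d = demo.rename(columns={"iso3": "iso_d", **{z: f"{z}_d" for z in Z_COLS}})
    kao_d = kaopen.rename(columns={"iso3": "iso_d", "kaopen": "kaopen_j"})

    panel = (
        cpis.merge(demo_o, on=["iso_o", "year"], how="left")
        .merge(demo_d, on=["iso_d", "year"], how="left")
        .merge(kao_d, on=["iso_d", "year"], how="left")  # destination KAOPEN
    )

    # Bilateral demographic distance: dZ_k_ij = Z_k_i - Z_k_j
    for k, z in enumerate(Z_COLS, start=1):
        panel[f"dZ_{k}"] = panel[f"{z}_o"] - panel[f"{z}_d"]

    # Panel entity identifier in "ISO_o_ISO_d" form
    panel["pair_id"] = panel["iso_o"] + "_" + panel["iso_d"]
    return panel


def log_sample(panel, col="position"):
    """Log specification: keep strictly positive positions only."""
    out = panel[panel[col] > 0].copy()
    out["log_portfolio_total"] = np.log(out[col])
    return out
```

The PPML sample would instead keep the zero rows and use the position in levels.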

## 6. Estimation Details

### PanelGLS with AR(1) (both papers)
- Iterative Cochrane-Orcutt estimation
- Prais-Winsten transformation for first observation
- AR(1) process modeled within each entity (country or pair); rho estimated from pooled residuals
- Standard errors from GLS covariance matrix
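
The estimation steps above can be sketched in plain NumPy. This is a simplified illustration and not the `PanelGLS` class in `src/model.py`: it assumes a single rho shared across entities, rows sorted by entity and then year with no gaps, and `panel_ar1_gls` is a hypothetical name.

```python
import numpy as np


def _ar1_transform(v, first, rho):
    """Quasi-difference v within entities; Prais-Winsten scaling for first rows."""
    out = np.empty_like(v, dtype=float)
    out[first] = np.sqrt(1.0 - rho ** 2) * v[first]
    idx = np.flatnonzero(~first)
    out[idx] = v[idx] - rho * v[idx - 1]
    return out


def panel_ar1_gls(y, X, entity, tol=1e-8, max_iter=100):
    """Iterated Cochrane-Orcutt with a Prais-Winsten first observation.

    Returns (beta, rho, standard_errors).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    entity = np.asarray(entity)

    # True where a row is the first observation of its entity.
    first = np.r_[True, entity[1:] != entity[:-1]]
    idx = np.flatnonzero(~first)

    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from pooled OLS
    rho = 0.0
    for _ in range(max_iter):
        e = y - X @ beta
        # Pooled AR(1) coefficient from within-entity residual pairs.
        rho_new = (e[idx] @ e[idx - 1]) / (e[idx - 1] @ e[idx - 1])
        ys = _ar1_transform(y, first, rho_new)
        Xs = _ar1_transform(X, first, rho_new)
        beta = np.linalg.lstsq(Xs, ys, rcond=None)[0]
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break

    # GLS covariance: sigma^2 * (X*' X*)^-1 on the transformed data.
    resid = ys - Xs @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(Xs.T @ Xs)
    return beta, rho, np.sqrt(np.diag(cov))
```

Scaling the first observation by sqrt(1 - rho^2) keeps it in the sample, whereas pure Cochrane-Orcutt would drop one row per entity.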

### Key robustness specifications
| Test | Paper | Method |
|:-----|:------|:-------|
| Country FE | 1 | OLS with country + year dummies (within R²) |
| Long differences | 1 | OLS on non-overlapping 5yr/10yr changes |
| Country-clustered SE | 1 | OLS with clustering by iso3 |
| Pair-clustered SE | 2 | OLS with clustering by pair_id |
| Reporter-clustered SE | 2 | OLS with clustering by iso_o |
| Two-way clustered SE | 2 | Cameron-Gelbach-Miller two-way variance (V_reporter + V_partner - V_pair) |
| Pair FE | 2 | OLS with pair dummies + year dummies |
| PPML | 2 | Poisson quasi-MLE on levels (subsample) |
| Extensive margin | 2 | Logit on 1(position > 0) |
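
For the two-way clustered row, the Cameron-Gelbach-Miller variance adds the two one-way cluster-robust matrices and subtracts the one clustered on their intersection (here, the pair). The sketch below shows that arithmetic for OLS residuals; it omits finite-sample corrections, so it will not match the package output exactly, and both function names are hypothetical.

```python
import numpy as np


def cluster_cov(X, resid, groups):
    """One-way cluster-robust sandwich covariance for OLS (no small-sample scaling)."""
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    bread = np.linalg.inv(X.T @ X)

    # Sum the score contributions x_i * e_i within each cluster.
    codes = np.unique(np.asarray(groups), return_inverse=True)[1]
    sums = np.zeros((codes.max() + 1, X.shape[1]))
    np.add.at(sums, codes, X * resid[:, None])

    meat = sums.T @ sums
    return bread @ meat @ bread


def twoway_cluster_cov(X, resid, reporter, partner):
    """Two-way variance: V_reporter + V_partner - V_pair."""
    pair = np.array([f"{o}_{d}" for o, d in zip(reporter, partner)])
    return (
        cluster_cov(X, resid, reporter)
        + cluster_cov(X, resid, partner)
        - cluster_cov(X, resid, pair)
    )
```

Standard errors are the square roots of the diagonal. The resulting matrix is not guaranteed to be positive semi-definite in finite samples, so the diagonal is worth checking before taking roots.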

## 7. Key Output Files

### Tables cited in Paper 1
| Paper Table | Output File |
|:------------|:-----------|
| Baseline (Table 1) | regression_baseline_demo_plus_eba_140.csv |
| Extended (Table 2) | regression_extended_plus_rates_140.csv |
| Jackknife (Table 8) | jackknife_baseline.csv |
| Projections (Table 9) | projection_table_140.csv |
| Country FE (Appendix E) | referee_robustness_p1.csv |
| Long diff (Appendix F) | referee_robustness_p1.csv |
| CCA diagnostics (Appendix H) | leave_one_country_out.csv, cca_observables_comparison.csv |
| Rate coverage (Appendix I) | rate_coverage_matrix.csv |

### Tables cited in Paper 2
| Paper Table | Output File |
|:------------|:-----------|
| Gravity baseline (Table 1) | gravity_results.csv |
| Rate decomposition (Table 2b) | mediation_decomposition.csv |
| Channel decomposition (Table 3) | mediation_decomposition.csv |
| CCA robustness (Table 4) | gravity_robustness.csv |
| PPML (Table 5c) | ppml_results.csv |
| Clustered SE (Table 5d) | referee_robustness.csv |
| Pair FE (Table 5e) | referee_robustness.csv |
| Projections (Tables 6-8) | bilateral_projections_ge.csv |
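
After a full run, a quick check that the mapped files exist can catch a script that failed silently. This is a minimal sketch assuming the files sit directly under each paper's `output/tables/` directory; `missing_outputs` and `PAPER2_TABLE_FILES` are hypothetical names.

```python
from pathlib import Path

# Distinct output files from the Paper 2 mapping above.
PAPER2_TABLE_FILES = [
    "gravity_results.csv",
    "mediation_decomposition.csv",
    "gravity_robustness.csv",
    "ppml_results.csv",
    "referee_robustness.csv",
    "bilateral_projections_ge.csv",
]


def missing_outputs(table_dir, filenames):
    """Return the expected files that are not present in table_dir."""
    table_dir = Path(table_dir)
    return [name for name in filenames if not (table_dir / name).is_file()]


# Example call from the package root:
# missing_outputs("gravity_bilateral/output/tables", PAPER2_TABLE_FILES)
```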

## 8. Persistence Diagnostics

The estimated AR(1) parameters indicate high persistence in both dependent variables:

| Variable | Estimated rho | Paper |
|:---------|:-------------|:------|
| CA/GDP (multilateral) | 0.808 | 1 |
| log(bilateral portfolio) | 0.940 | 2 |

This motivates: (i) the AR(1) correction in the GLS estimator, (ii) long-difference specifications as robustness (Paper 1), and (iii) pair-clustered SEs that allow arbitrary within-entity dependence (Paper 2).
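
One way to read these magnitudes is the implied half-life of a shock under an AR(1) process, ln(0.5) / ln(rho), measured in years because the panels are annual. The snippet below is an illustrative calculation on the two estimates reported above, not a result from either paper; `half_life` is a hypothetical helper.

```python
import math


def half_life(rho):
    """Periods until an AR(1) shock decays to half its initial size."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


# Estimates taken from the table above.
RHO = {"CA/GDP (multilateral)": 0.808, "log(bilateral portfolio)": 0.940}

for name, rho in RHO.items():
    print(f"{name}: rho = {rho:.3f}, half-life = {half_life(rho):.1f} years")
```

Under this reading, shocks to bilateral portfolio positions take several times longer to dissipate than shocks to the current account, which is consistent with the emphasis on within-pair dependence in Paper 2.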
